Initial commit: Rusty AI SDK - unified multi-provider AI framework by undivisible · Pull Request #1 · undivisible/rs_ai

undivisible · 2026-03-31T12:56:02Z

Summary

This is the initial commit of the Rusty AI SDK, a comprehensive Rust framework for building AI applications with support for multiple providers (OpenAI, Anthropic Claude, Google Gemini, Ollama, and local runtimes) and unified abstractions for language models, embeddings, streaming, and tool calling.

Key Changes

Core Framework (`rusty_ai`)

Trait-based abstractions: LanguageModel, EmbeddingModel, Provider traits for pluggable implementations
Unified types: Prompt, Message, ContentPart, StreamEvent, GenerateResult for consistent API across providers
Streaming support: AiStream with StreamEvent enum for real-time response handling
Tool calling: ToolDefinition, ToolCallRequest, ToolChoice for structured function invocation
Extended thinking: ThinkingConfig enum supporting Anthropic adaptive thinking, Gemini budget-based, and Ollama reasoning modes
Capabilities system: Capability and CapabilitySet for runtime feature detection
Router: Dynamic request routing based on prompt/options conditions
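
The capability and routing items above can be illustrated with a small sketch. It is not the SDK's actual code: the `Capability` variants shown, the `CapabilitySet` layout, the `Route` enum, and `route_local_first` are all hypothetical names chosen for illustration. It assumes a router that prefers a local model only when that model's capabilities cover what the request needs.

```rust
use std::collections::HashSet;

/// Hypothetical capability flags (assumption: the real enum has more variants).
#[allow(dead_code)]
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub enum Capability {
    ToolCalling,
    StructuredOutput,
    Streaming,
}

/// Hypothetical capability set backed by a HashSet.
#[derive(Debug, Default, Clone)]
pub struct CapabilitySet(HashSet<Capability>);

impl CapabilitySet {
    /// Builder-style helper that adds one capability.
    pub fn with(mut self, cap: Capability) -> Self {
        self.0.insert(cap);
        self
    }

    /// True when every capability in `needs` is present in this set.
    pub fn satisfies(&self, needs: &[Capability]) -> bool {
        needs.iter().all(|c| self.0.contains(c))
    }
}

/// Hypothetical routing outcome.
#[derive(Debug, PartialEq, Eq)]
pub enum Route {
    Local,
    Cloud,
}

/// Prefer the local model, but fall through to cloud when it lacks
/// a capability the request needs.
pub fn route_local_first(local: &CapabilitySet, needs: &[Capability]) -> Route {
    if local.satisfies(needs) {
        Route::Local
    } else {
        Route::Cloud
    }
}
```

With a local set that only contains `Capability::Streaming`, a request needing `Capability::ToolCalling` would be routed to `Route::Cloud`, while a request with no special needs stays on `Route::Local`.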

Provider Implementations

OpenAI-compatible adapter (rusty_openai_compatible): Generic HTTP API adapter with SSE streaming and tool-call accumulation
ChatGPT wrapper (rusty_chatgpt): Pre-configured OpenAI client with well-known model constants
Claude (rusty_claude): Anthropic Messages API with streaming, system message separation, and thinking support
Gemini (rusty_gemini): Google Gemini with SSE streaming and structured output
Ollama (rusty_ollama): Local Ollama server integration with chat and embedding support
Local runtimes: Bridges for Gemini Nano (Android), Apple Foundation Models, Windows Phi Silica, and browser AI APIs

Utilities

Middleware system (rusty_middleware): Composable request/response interceptors (logging, caching, retry)
UI stream protocol (rusty_ui_stream): Frontend-friendly event format with SSE and NDJSON encoders
Testing utilities (rusty_testing): Mock models and providers with call recording for unit tests
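
For readers unfamiliar with the two wire formats named above, here is a minimal sketch of SSE and NDJSON framing. The function names `encode_sse` and `encode_ndjson` are hypothetical and are not taken from `rusty_ui_stream`; the sketch assumes the payload is already a serialized JSON string.

```rust
/// Frame a payload as one Server-Sent Events message.
/// Each payload line gets its own `data:` field, and a blank line ends the message.
pub fn encode_sse(event: Option<&str>, data: &str) -> String {
    let mut out = String::new();
    if let Some(name) = event {
        out.push_str("event: ");
        out.push_str(name);
        out.push('\n');
    }
    for line in data.split('\n') {
        out.push_str("data: ");
        out.push_str(line);
        out.push('\n');
    }
    // The empty line dispatches the event on the client side.
    out.push('\n');
    out
}

/// Frame a payload as one NDJSON record: the JSON text followed by a newline.
/// Assumes `json` contains no raw newlines (compact serialization).
pub fn encode_ndjson(json: &str) -> String {
    let mut out = String::with_capacity(json.len() + 1);
    out.push_str(json);
    out.push('\n');
    out
}
```

SSE suits browser `EventSource` consumers, while NDJSON is easier to read line by line from a plain HTTP response body.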

Examples

Basic text generation, streaming, multimodal input, tool loops
Object/structured output generation and streaming
Router-based model selection
Local runtime integration patterns (Android, Apple, Windows)

Notable Implementation Details

Async-first design: All I/O operations use async_trait and tokio
Stream composition: Uses futures::stream for efficient event processing and transformation
Error handling: Unified AiError type with provider-specific context
Type safety: Serde-based serialization with careful null-handling for optional fields
Extensibility: Bridge pattern for platform-specific integrations (JNI, Swift, Win32)
Workspace structure: Modular crate organization with shared dependencies via workspace manifest

https://claude.ai/code/session_01NMtKKc9beRbzKEznEEzQZq

Complete Cargo workspace with unified AI SDK architecture:

Core crates:
- rusty_ai: traits (LanguageModel, EmbeddingModel, Provider, Tool, Middleware), typed errors, streaming (futures::Stream), structured output, routing
- rusty_middleware: retry with backoff, logging, caching, middleware chain
- rusty_ui_stream: SSE + NDJSON encoders, versioned UI protocol
- rusty_testing: mock models/providers, stream assertions

Cloud providers:
- rusty_openai_compatible: generic OpenAI-compatible API adapter
- rusty_chatgpt: OpenAI ChatGPT (GPT-4o, o3-mini)
- rusty_claude: Anthropic Messages API (Sonnet, Opus, Haiku)
- rusty_gemini: Google Gemini API with multimodal support
- rusty_ollama: local Ollama server with NDJSON streaming

Local/platform runtimes (bridge-based, first-class):
- rusty_gemini_nano: Android Prompt API with session support
- rusty_foundationmodels: Apple Foundation Models
- rusty_phi_silica: Windows NPU Phi Silica
- rusty_browser: Chrome/Edge built-in AI for WASM targets

All crates compile cleanly against workspace dependencies.

https://claude.ai/code/session_01NMtKKc9beRbzKEznEEzQZq

Expanded mock bridge implementations and routing demonstrations. https://claude.ai/code/session_01NMtKKc9beRbzKEznEEzQZq

…d output fixes

Core (rusty_ai):
- Add ThinkingConfig enum (Adaptive, Budget, Enabled) and ReasoningEffort to GenerateOptions
- Add ThinkingDelta and SyntheticStreamingNotice stream events
- Add ExtendedThinking, VideoInput, AudioInput, AudioOutput Capability variants
- SyntheticStreamer now emits SyntheticStreamingNotice before text chunks

rusty_claude:
- Fix ImageSource to support both base64 and URL sources (no more [image: url] fallback)
- Add structured output via output_config.format (json_schema, GA 2026 API)
- Add extended thinking via thinking field (adaptive mode)
- Add ThinkingDelta/SignatureDelta stream parser handling
- Update model IDs: claude-opus-4-6, claude-sonnet-4-6, claude-haiku-4-5-20251001

rusty_gemini:
- Update models to gemini-2.5-pro, gemini-2.5-flash, gemini-2.5-flash-lite
- Add ThinkingConfig (thinking_budget/thinking_level) to GenerationConfig
- Add id field to FunctionCall/FunctionResponse (Gemini 3+ requirement)
- Add Thought part variant for thinking token streaming
- Add responseSchema/responseMimeType for structured output

rusty_ollama:
- Add think: Option<bool> to chat request (reasoning models: deepseek-r1, qwen3)
- Add thinking field to response (streaming + non-streaming)
- Pass full JSON Schema as format field for structured output (Ollama 2025+)
- Emit ThinkingDelta events from NDJSON stream parser

https://claude.ai/code/session_01NMtKKc9beRbzKEznEEzQZq

…uctured output

rusty_claude:
- Update models: claude-opus-4-6, claude-sonnet-4-6, claude-haiku-4-5-20251001
- Add ExtendedThinking + StructuredOutput capabilities
- Fix stream_parser: handle ThinkingDelta and SignatureDelta events
- Fix convert: pass thinking config and output_config to API

rusty_gemini:
- Update models: gemini-2.5-pro, gemini-2.5-flash, gemini-2.5-flash-lite
- Add ExtendedThinking, VideoInput, AudioInput capabilities
- Add thinking_config (budget/level) to generation config
- Add id field to FunctionCall/FunctionResponse (Gemini 3+ compat)
- Add Thought GeminiPart variant; emit ThinkingDelta from stream parser

rusty_ollama:
- Pass full JSON Schema as format field for structured output
- Add think flag propagation; emit ThinkingDelta from NDJSON stream

rusty_chatgpt:
- Add gpt-5.4, gpt-5.4-mini, gpt-5.4-nano models
- Add gpt54() and gpt54_mini() convenience methods

rusty_phi_silica:
- Add stream_tokens() to bridge trait (maps to GenerateResponseWithUpdatesAsync)
- Replace SyntheticStreamer with real chunk-based streaming in model

rusty_browser:
- Add BackingModel enum (GeminiNano vs PhiSilica)
- Add response_constraint to BrowserAiOptions (Chrome Prompt API)
- Add supports_response_constraint capability flag
- Update docs: window.ai deprecated, use LanguageModel global directly

rusty_ui_stream:
- Handle ThinkingDelta and SyntheticStreamingNotice events in encoder

https://claude.ai/code/session_01NMtKKc9beRbzKEznEEzQZq

rusty_browser: update BrowserAiBridge doc comments noting window.ai deprecation in Chrome 138+, direct LanguageModel global usage, and Edge/Phi Silica backing distinction

rusty_phi_silica: fix stream() to drive bridge.stream_tokens() directly instead of calling generate() and wrapping in SyntheticStreamer

https://claude.ai/code/session_01NMtKKc9beRbzKEznEEzQZq

- All providers now use pub const for well-known model IDs
- Added Gemini 3 series: gemini-3.1-pro-preview, gemini-3-flash, gemini-3.1-flash-live-preview, gemini-embedding-2-preview
- Provider trait gains fetch_models() for dynamic API discovery
- GeminiProvider::list_remote_models() queries /v1beta/models
- ChatGptProvider::list_remote_models() queries /v1/models
- OllamaProvider already had list_models() via /api/tags

https://claude.ai/code/session_01NMtKKc9beRbzKEznEEzQZq

Core:
- Add SpeechToTextModel + TextToSpeechModel traits with TranscriptionResult, AudioResult, TtsOptions types
- Add ModelRegistry for caching dynamically fetched models
- Provider trait gains speech_to_text_model(), text_to_speech_model(), fetch_models() methods
- RouteCondition type alias fixes clippy::type_complexity
- StreamEvent::SyntheticStreamingNotice for local runtime awareness

Providers:
- ChatGPT: add WHISPER, TTS, TTS_HD, GPT_4O_REALTIME, GPT_4O_AUDIO, GPT_4O_MINI_REALTIME consts + AudioInput/AudioOutput capabilities for voice models
- All providers: rename consts to _LATEST suffix pattern with docs pointing users to fetch_models() for dynamic discovery

CI:
- Add .github/workflows/ci.yml with check, test, clippy (-Dwarnings), fmt, doc, and MSRV (1.75) jobs

Quality:
- Fix all clippy warnings across entire workspace
- Run cargo fmt --all
- GeminiRequestParts struct replaces complex return tuple

https://claude.ai/code/session_01NMtKKc9beRbzKEznEEzQZq

https://claude.ai/code/session_01NMtKKc9beRbzKEznEEzQZq

- MSRV 1.80 insufficient for transitive deps; bump to 1.85
- Fix unresolved rustdoc links in middleware.rs, provider.rs, and rusty_ui_stream lib.rs

https://claude.ai/code/session_01NMtKKc9beRbzKEznEEzQZq

Router::local_first was always returning true from its route condition, making the cloud fallback unreachable. The closure now checks whether the local model's CapabilitySet satisfies the request's needs (tool calling, structured output) and falls through to cloud when it doesn't.

CacheMiddleware was keying on the prompt alone, so requests with the same prompt but different temperature, tools, output schema, or other generation options incorrectly returned the same cached result. The key now hashes all generation-affecting fields (numeric options via bit patterns, serializable types via JSON, enum variants via Debug).

Tests added for both: four router routing scenarios and four cache hit/miss/TTL scenarios using MockLanguageModel.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
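
The cache-key fix can be sketched roughly as follows. This is an illustration under stated assumptions, not the actual `CacheMiddleware` code: `Options`, its fields, and the `ToolChoice` variants are hypothetical stand-ins, and the JSON-serialized fields mentioned in the commit are left out so the sketch needs only the standard library.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Hypothetical stand-in for a tool-choice enum (not the real rusty_ai type).
#[allow(dead_code)]
#[derive(Debug, Clone, Copy)]
pub enum ToolChoice {
    Auto,
    Required,
    Disabled,
}

/// Hypothetical subset of the options that influence model output.
#[derive(Debug, Clone)]
pub struct Options {
    pub temperature: Option<f32>,
    pub max_tokens: Option<u32>,
    pub tool_choice: ToolChoice,
}

/// Build a cache key from the prompt and every generation-affecting option.
pub fn cache_key(prompt: &str, opts: &Options) -> u64 {
    let mut h = DefaultHasher::new();
    prompt.hash(&mut h);
    // f32 does not implement Hash, so hash its bit pattern instead.
    opts.temperature.map(f32::to_bits).hash(&mut h);
    opts.max_tokens.hash(&mut h);
    // Enum variants are hashed through their Debug representation.
    format!("{:?}", opts.tool_choice).hash(&mut h);
    h.finish()
}
```

Two requests with the same prompt but different `temperature` values now hash to different keys, so one no longer receives the other's cached result.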

Stable rustfmt versions differ between local (1.93) and CI runners, causing spurious fmt failures. Nightly rustfmt is the conventional choice for CI formatting checks. Also clear RUSTFLAGS for the fmt job. https://claude.ai/code/session_01NMtKKc9beRbzKEznEEzQZq

Root cause: no rustfmt.toml meant different rustfmt versions (local 1.93 vs CI stable) produced different output. Adding rustfmt.toml with edition="2021" ensures deterministic formatting regardless of toolchain version. https://claude.ai/code/session_01NMtKKc9beRbzKEznEEzQZq

Four distinct silent-failure patterns fixed across all streaming providers:

1. SSE/NDJSON parse failures now terminate the stream with StreamError instead of logging a warning and continuing as if no data was lost. Affected: Claude (parse_sse_event), Gemini, Ollama (build_ndjson_stream), OpenAI-compatible.
2. Malformed tool-call JSON (accumulated from streaming deltas) now emits a StreamError/Error event rather than silently substituting an empty arguments object {}. Affected: Claude (ContentBlockStop), OpenAI-compatible (flush_pending_tools and inline flush).
3. Transport error source chain was being dropped (source: None) in the Claude and Gemini byte-stream error paths. The original reqwest::Error is now preserved via source: Some(Box::new(e)).
4. The OpenAI-compatible stream parser had an unreachable .unwrap() on a HashMap re-query to extract a call_id that was already bound as `_id` in the pattern match. Fixed to use the bound variable directly, removing the underscore suppressor.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
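
The source-chain fix in item 3 can be illustrated with a standard-library-only sketch. `TransportError`, `wrap_io`, and `chain` are hypothetical names, and `std::io::Error` stands in for `reqwest::Error` so the example has no external dependencies.

```rust
use std::error::Error;
use std::fmt;

/// Hypothetical transport error that keeps the underlying cause.
#[derive(Debug)]
pub struct TransportError {
    pub message: String,
    pub source: Option<Box<dyn Error + Send + Sync + 'static>>,
}

impl fmt::Display for TransportError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "transport error: {}", self.message)
    }
}

impl Error for TransportError {
    // Expose the stored cause so callers can walk the chain.
    fn source(&self) -> Option<&(dyn Error + 'static)> {
        self.source.as_deref().map(|e| e as &(dyn Error + 'static))
    }
}

/// Wrap a low-level error instead of discarding it with `source: None`.
pub fn wrap_io(e: std::io::Error) -> TransportError {
    TransportError {
        message: "byte stream failed".to_string(),
        source: Some(Box::new(e)),
    }
}

/// Collect the Display text of an error and all of its causes.
pub fn chain(err: &(dyn Error + 'static)) -> Vec<String> {
    let mut out = vec![err.to_string()];
    let mut cur = err.source();
    while let Some(e) = cur {
        out.push(e.to_string());
        cur = e.source();
    }
    out
}
```

With `source: None`, `chain` would return only the outer message; once the original error is stored, the underlying cause (for example a connection reset) shows up as a second entry.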

Two issues fixed for every provider's non-2xx error path:

1. .unwrap_or_default() on response.text() silently produced a blank error message when the body couldn't be read (e.g. connection reset mid-response). All providers now log a warning and include the read-failure reason in the returned ProviderError message.
2. GeminiProvider::list_remote_models and ChatGptProvider::list_remote_models were discarding the HTTP status code (status: None) after checking it, preventing the retry middleware from distinguishing 429 from 500. Both now capture and forward status: Some(status_code).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
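
A rough sketch of both fixes, under stated assumptions: `ProviderError` here is a simplified hypothetical struct, `from_response` and `classify` are illustrative helpers, and the `RetryClass` policy is an assumption about how a retry layer might use the status code (the PR does not describe the actual policy). Logging is omitted.

```rust
/// Simplified, hypothetical provider error carrying the HTTP status.
#[derive(Debug)]
pub struct ProviderError {
    pub status: Option<u16>,
    pub message: String,
}

/// Build an error from a non-2xx response. If the body could not be read,
/// say so in the message instead of falling back to an empty string.
pub fn from_response(status: u16, body: Result<String, std::io::Error>) -> ProviderError {
    let message = match body {
        Ok(text) => text,
        Err(e) => format!("<failed to read error body: {e}>"),
    };
    // Forward the status so downstream retry logic can inspect it.
    ProviderError { status: Some(status), message }
}

/// Hypothetical classification a retry layer could apply.
#[derive(Debug, PartialEq, Eq)]
pub enum RetryClass {
    RateLimited,
    ServerError,
    NotRetryable,
    Unknown,
}

pub fn classify(err: &ProviderError) -> RetryClass {
    match err.status {
        Some(429) => RetryClass::RateLimited,
        Some(s) if (500..600).contains(&s) => RetryClass::ServerError,
        Some(_) => RetryClass::NotRetryable,
        // Without a status code the two cases above cannot be told apart.
        None => RetryClass::Unknown,
    }
}
```

The `None` arm shows why dropping the status mattered: a 429 and a 500 both collapse into the same undifferentiated case.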

…sion

RetryMiddleware: replace .expect() on last_error with .unwrap_or_else returning a descriptive Transport error, so no panic occurs if the loop invariant is somehow violated.

LoggingMiddleware: success and error paths now both respect the configured tracing level (previously error path always used ERROR, ignoring with_level() settings). Error path now uses ?e (Debug format) to preserve the full error source chain; Display was silently dropping the underlying reqwest/IO cause.

CacheMiddleware: cache_key() now returns Option<u64>. If the prompt cannot be serialized (returning None), process() bypasses the cache entirely rather than hashing an empty DefaultHasher state, which previously caused every un-serializable prompt to collide on a single constant hash bucket.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
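
The RetryMiddleware change can be pictured with a generic, synchronous sketch. The `retry` function below is hypothetical and omits backoff and async; it only shows the `unwrap_or_else` fallback replacing a panicking `.expect()` when no attempt ever recorded an error.

```rust
/// Run `op` up to `max_attempts` times and return the first success.
/// If every attempt fails, return the last error. If no attempt ran at all
/// (so no error was recorded), return `fallback()` instead of panicking.
pub fn retry<T, E, F, G>(max_attempts: u32, mut op: F, fallback: G) -> Result<T, E>
where
    F: FnMut(u32) -> Result<T, E>,
    G: FnOnce() -> E,
{
    let mut last_error: Option<E> = None;
    for attempt in 0..max_attempts {
        match op(attempt) {
            Ok(value) => return Ok(value),
            Err(e) => last_error = Some(e),
        }
    }
    // Previously this would have been `.expect(...)`, which panics on None.
    Err(last_error.unwrap_or_else(fallback))
}
```

Calling `retry` with `max_attempts` set to 0 exercises the fallback path: the loop body never runs, and the caller gets a descriptive error rather than a panic.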

generate_text returned AiError::Serialization when the model responded with no text (tool-calls-only response). This is a provider response characteristic, not a serialization error. Now returns ProviderError with a clear message including the provider_id. Doc comment updated to describe the failure mode.

ThinkingConfig::Adaptive doc comment referenced 'claude-opus-4-6+' which is not a valid Anthropic model identifier. Replaced with a correct description: 'claude-3-7-sonnet and later'.

OpenAiCompatibleModel::new() now delegates to try_new() which returns AiResult<Self>, mapping the reqwest build failure to AiError::Transport with the source chain preserved. new() wraps it with a descriptive .expect() that names the actual failure condition (TLS unavailable). Callers that need error recovery can use try_new() directly.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
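
The `new()` / `try_new()` split follows a common Rust constructor pattern, sketched below. `Client`, `BuildError`, and the URL-scheme check are hypothetical; the real constructor fails when the reqwest client cannot be built, which is replaced here by a simple validation so the sketch compiles with the standard library alone.

```rust
/// Hypothetical build error for the sketch.
#[derive(Debug)]
pub struct BuildError(pub String);

/// Hypothetical client with a fallible and an infallible constructor.
#[derive(Debug)]
pub struct Client {
    pub base_url: String,
}

impl Client {
    /// Fallible constructor for callers that want to recover from failure.
    pub fn try_new(base_url: &str) -> Result<Self, BuildError> {
        if base_url.starts_with("http://") || base_url.starts_with("https://") {
            Ok(Self { base_url: base_url.to_string() })
        } else {
            Err(BuildError(format!("unsupported base URL: {base_url}")))
        }
    }

    /// Convenience constructor that delegates to `try_new` and panics with a
    /// message naming the actual failure condition.
    pub fn new(base_url: &str) -> Self {
        Self::try_new(base_url)
            .expect("Client::new: base_url must start with http:// or https://")
    }
}
```

Keeping `new()` as a thin wrapper means the panic message and the error path share one implementation, and library users who cannot tolerate panics have an explicit alternative.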

claude and others added 16 commits March 30, 2026 10:06

fix: improve local_windows and router examples

6485400

fix: bump MSRV from 1.75 to 1.80 (LazyLock requires 1.80)

87e95b1

https://claude.ai/code/session_01NMtKKc9beRbzKEznEEzQZq

fix: bump MSRV to 1.85, fix broken doc links

016af56

undivisible merged commit 4d44ac2 into m Apr 23, 2026
10 of 12 checks passed

undivisible deleted the claude/rusty-ai-sdk-design-7bT48 branch April 23, 2026 10:27
